CASTLE: Regularization via Auxiliary Causal Graph Discovery

Neural Information Processing Systems

Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than in the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclic graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a plethora of synthetic and real publicly available datasets, demonstrating that CASTLE consistently leads to better out-of-sample predictions than other popular benchmark regularizers.
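To make the "learns the causal DAG as an adjacency matrix" idea concrete, a common way to keep a learned weighted adjacency matrix acyclic is the continuous NOTEARS-style penalty h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when W encodes a DAG. The sketch below is illustrative, not CASTLE's exact implementation; the function name and usage are assumptions.

```python
import numpy as np
from scipy.linalg import expm


def acyclicity_penalty(W: np.ndarray) -> float:
    """NOTEARS-style acyclicity penalty: h(W) = tr(exp(W * W)) - d.

    W is a d x d weighted adjacency matrix (W * W is the elementwise
    square). h(W) == 0 iff W encodes a directed acyclic graph, so adding
    h(W) to the training loss pushes the adjacency matrix learned in the
    input layers toward a DAG.
    """
    d = W.shape[0]
    return float(np.trace(expm(W * W)) - d)
```

For a DAG such as `W = [[0, 1], [0, 0]]` the penalty is 0, while a 2-cycle `W = [[0, 1], [1, 0]]` yields 2·cosh(1) − 2 ≈ 1.086, so gradient descent on this term drives cycles out of the learned graph.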


Review for NeurIPS paper: CASTLE: Regularization via Auxiliary Causal Graph Discovery

Neural Information Processing Systems

Summary and Contributions: The aim of this paper is to improve the performance of supervised learning on out-of-sample data. In the case of deep networks, regularization helps mitigate overfitting but does not exploit the structure of the feature variables and their relation to the outcome when the data-generating process (DGP) can be represented by a causal DAG. The authors propose CASTLE, which jointly learns the causal graph while performing regularization. In particular, the adjacency matrix of the learned DAG is used in the input layers of the neural network, which translates to the penalty function decomposing into the reconstruction loss found in SAE, a (new) acyclicity loss, and a capacity-based regularizer of the adjacency matrices. Unlike other approaches, CASTLE improves upon capacity-based and auto-encoder-based regularization by exploiting the DAG structure both to identify causal predictors (parents of Y, if they exist) and to select targets for reconstruction regularization (features that have neighbours in the underlying DAG).
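The penalty decomposition the reviewer describes can be sketched as one composite loss. This is a minimal illustration of the structure (prediction + neighbour-restricted reconstruction + acyclicity + capacity term), assuming mean-squared losses and scalar trade-off weights; the function name, arguments, and weights are hypothetical and not the paper's exact formulation.

```python
import numpy as np
from scipy.linalg import expm


def castle_style_loss(y, y_hat, X, X_hat, W, lam=1.0, rho=1.0, beta=1.0):
    """Illustrative CASTLE-style objective.

    y, y_hat : targets and predictions (n,)
    X, X_hat : input features and their reconstructions (n, d)
    W        : learned d x d weighted adjacency matrix
    lam, rho, beta : illustrative trade-off weights
    """
    d = W.shape[0]
    pred = np.mean((y - y_hat) ** 2)  # supervised prediction loss

    # Reconstruct only features with at least one causal neighbour in W,
    # rather than all input features as a plain autoencoder would.
    has_neighbour = (np.abs(W).sum(axis=0) + np.abs(W).sum(axis=1)) > 0
    recon = np.mean((X[:, has_neighbour] - X_hat[:, has_neighbour]) ** 2)

    acyclic = np.trace(expm(W * W)) - d  # NOTEARS-style acyclicity loss
    capacity = np.sum(W ** 2)            # capacity (L2) term on W

    return float(pred + lam * recon + rho * acyclic + beta * capacity)
```

With perfect predictions and reconstructions and a DAG-valued W, only the capacity term remains, which illustrates how the four pieces contribute independently to the total penalty.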

